A Platform for Evolving Genetic Automata for Text Segmentation (GNATS)
Author
Abstract
Developers of large-scale document processing and image recognition systems need a dynamically robust character segmentation component. Without this essential module, potential turn-key products will remain in the laboratory indefinitely. An experiment in evolving a biologically based neural image processing system that can isolate characters within an unstructured text image is presented. In this study, organisms are simulated using a genetic algorithm with the goal of learning the intelligent behavior required for locating and consuming text image characters. Each artificial life-form is defined by a genotype containing a list of interdependent control parameters that contribute to specific functions of the organism. Control functions include vision, consumption, and movement. Using asexual reproduction in conjunction with random mutation, a domain-independent solution for text segmentation is sought. For this experiment, an organism's vision system uses a rectangular receptor field with signals accumulated using Gabor functions. The optimal subset of Gabor kernel functions for character segmentation is determined through the process of evolution. Two analyses of the results are presented. The first, a study of performance over evolved generations, shows that the qualifiers for the natural selection of dominant organisms increased by 62%. The second visually compares and discusses the variations between the dominant genotypes of the first generation and the uniform genotypes resulting from the final generation.
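The abstract names two mechanisms: Gabor functions accumulated over a rectangular receptor field, and asexual reproduction with random mutation. They can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the bit-mask genotype (one bit per Gabor kernel in a bank), the truncation selection scheme, and the toy fitness function are all assumptions introduced here for illustration; the original work scores organisms on their ability to locate and consume characters, which is not reproduced.

```python
import math
import random


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Sample the real part of a 2-D Gabor function on a size x size grid (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the kernel orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def receptor_response(patch, kernel):
    """Accumulate the signal of one kernel over a rectangular image patch."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))


def mutate(genotype, rate, rng):
    """Asexual reproduction: copy the parent, flipping each bit with probability rate."""
    return [1 - g if rng.random() < rate else g for g in genotype]


def evolve(fitness, genome_length, pop_size=20, generations=30, rate=0.05, seed=0):
    """Keep the fitter half each generation; each survivor leaves one mutated clone."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        offspring = [mutate(parent, rate, rng) for parent in survivors]
        population = survivors + offspring
    return max(population, key=fitness)


# Hypothetical stand-in fitness: agreement with an arbitrary target kernel subset.
TARGET = [1, 0, 1, 0, 0, 1, 0, 0]


def toy_fitness(genotype):
    return sum(1 for g, t in zip(genotype, TARGET) if g == t)


if __name__ == "__main__":
    # A bank of 8 orientations; the evolved bit mask selects a subset of it.
    bank = [gabor_kernel(7, 4.0, i * math.pi / 8, 2.0) for i in range(8)]
    best = evolve(toy_fitness, genome_length=len(bank))
    print("selected kernel indices:", [i for i, g in enumerate(best) if g])
```

Because the surviving parents are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next. A real system would replace `toy_fitness` with a score derived from the organism's behavior on text images.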
Similar resources
Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA
With the explosive growth in the amount of information, tools and methods are needed to search, filter, and manage resources. One of the major problems in text classification relates to high-dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...
A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling
In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...
Cluster-Based Image Segmentation Using Fuzzy Markov Random Field
Image segmentation is an important task in image processing and computer vision that attracts the attention of many researchers. The pixels in an image carry two kinds of information: statistical and structural, which refer to the feature value of pixel data and the local correlation of pixel data, respectively. Markov random field (MRF) is a tool for modeling statistical and structural inf...
Journal title:
Volume, issue:
Pages: -
Publication date: 1992